AITopics | binary data

Collaborating Authors

binary data

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Distributed Flexible Nonlinear Tensor Factorization

Shandian Zhe, Kai Zhang, Pengyuan Wang, Kuang-chih Lee, Zenglin Xu, Yuan Qi, Zoubin Ghahramani

Neural Information Processing SystemsMay-1-2026, 05:55:37 GMT

Tensor factorization is a powerful tool to analyse multi-way data. Recently proposed nonlinear factorization methods, although capable of capturing complex relationships, are computationally quite expensive and may suffer a severe learning bias in case of extreme data sparsity. Therefore, we propose a distributed, flexible nonlinear tensor factorization model, which avoids the expensive computations and structural restrictions of the Kronecker-product in the existing TGP formulations, allowing an arbitrary subset of tensorial entries to be selected for training. Meanwhile, we derive a tractable and tight variational evidence lower bound (ELBO) that enables highly decoupled, parallel computations and high-quality inference. Based on the new bound, we develop a distributed, key-value-free inference algorithm in the MAPREDUCE framework, which can fully exploit the memory cache mechanism in fast MAPREDUCE systems such as SPARK. Experiments demonstrate the advantages of our method over several state-of-the-art approaches, in terms of both predictive performance and computational efficiency.

artificial intelligence, machine learning, tensor, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

The normalization of (almost) everything: Our minds can get used to anything, and even crises start feeling normal Science

ScienceNov-6-2025, 14:01:00 GMT

For a long time, many climate scientists and advocates held onto an optimistic belief that once the impacts of climate change became undeniable, people and governments would act. But whereas the predictions of climate models have increasingly borne out, the assumptions about human behavior have not. Even as disasters mount, climate change remains low on voters' priority lists, and policy responses remain tepid. To me, this gap reflects a deeper failure--not just in policy or communication, but in how we understand human adaptability. When I began my career as a computational cognitive scientist, I was drawn to a defining strength of human cognition--a marked ability to adapt.

adaptation, climate change, experiment, (16 more...)

Science

Country: North America > United States > California > Los Angeles County > Los Angeles (0.15)

Industry: Health & Medicine (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

Add feedback

Differentiable Structure Learning for General Binary Data

Deng, Chang, Aragam, Bryon

arXiv.org Machine LearningSep-29-2025

Existing methods for differentiable structure learning in discrete data typically assume that the data are generated from specific structural equation models. However, these assumptions may not align with the true data-generating process, which limits the general applicability of such methods. Furthermore, current approaches often ignore the complex dependence structure inherent in discrete data and consider only linear effects. We propose a differentiable structure learning framework that is capable of capturing arbitrary dependencies among discrete variables. We show that although general discrete models are unidentifiable from purely observational data, it is possible to characterize the complete set of compatible parameters and structures. Additionally, we establish identifiability up to Markov equivalence under mild assumptions. We formulate the learning problem as a single differentiable optimization task in the most general form, thereby avoiding the unrealistic simplifications adopted by previous methods. Empirical results demonstrate that our approach effectively captures complex relationships in discrete data.

discrete data, graph, multibernoulli, (14 more...)

arXiv.org Machine Learning

2509.21658

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Binary and Ternary Quantization Can Enhance Feature Discrimination

Lu, Weizhi, Chen, Mingrui, Li, Weiyu

arXiv.org Artificial IntelligenceJul-14-2025

Quantization is widely applied in machine learning to reduce computational and storage costs for both data and models. Considering that classification tasks are fundamental to the field, it is crucial to investigate how quantization impacts classification performance. Traditional research has focused on quantization errors, assuming that larger errors generally lead to lower classification accuracy. However, this assumption lacks a solid theoretical foundation and often contradicts empirical observations. For example, despite introducing significant errors, $\{0,1\}$-binary and $\{0, \pm1\}$-ternary quantized data have sometimes achieved classification accuracy comparable or even superior to full-precision data. To reasonably explain this phenomenon, a more accurate evaluation of classification performance is required. To achieve this, we propose a direct analysis of the feature discrimination of quantized data, instead of focusing on quantization errors. Our analysis reveals that both binary and ternary quantization can potentially enhance, rather than degrade, the feature discrimination of the original data. This finding is supported by classification experiments conducted on both synthetic and real data.

artificial intelligence, machine learning, quantization, (15 more...)

arXiv.org Artificial Intelligence

2504.13792

Country: North America (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Archetypal Analysis for Binary Data

Wedenborg, A. Emilie J., Mørup, Morten

arXiv.org Machine LearningFeb-6-2025

Archetypal analysis (AA) is a matrix decomposition method that identifies distinct patterns using convex combinations of the data points denoted archetypes with each data point in turn reconstructed as convex combinations of the archetypes. AA thereby forms a polytope representing trade-offs of the distinct aspects in the data. Most existing methods for AA are designed for continuous data and do not exploit the structure of the data distribution. In this paper, we propose two new optimization frameworks for archetypal analysis for binary data. i) A second order approximation of the AA likelihood based on the Bernoulli distribution with efficient closed-form updates using an active set procedure for learning the convex combinations defining the archetypes, and a sequential minimal optimization strategy for learning the observation specific reconstructions. ii) A Bernoulli likelihood based version of the principal convex hull analysis (PCHA) algorithm originally developed for least squares optimization. We compare these approaches with the only existing binary AA procedure relying on multiplicative updates and demonstrate their superiority on both synthetic and real binary data. Notably, the proposed optimization frameworks for AA can easily be extended to other data distributions providing generic efficient optimization frameworks for AA based on tailored likelihood functions reflecting the underlying data distribution.

archetype, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

2502.04172

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > Montserrat (0.04)
Europe > Denmark (0.04)
Asia > India > Telangana > Hyderabad (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Tabular Data Generation using Binary Diffusion

Kinakh, Vitaliy, Voloshynovskiy, Slava

arXiv.org Artificial IntelligenceOct-28-2024

Generating synthetic tabular data is critical in machine learning, especially when real data is limited or sensitive. Traditional generative models often face challenges due to the unique characteristics of tabular data, such as mixed data types and varied distributions, and require complex preprocessing or large pretrained models. In this paper, we introduce a novel, lossless binary transformation method that converts any tabular data into fixed-size binary representations, and a corresponding new generative model called Binary Diffusion, specifically designed for binary data. Binary Diffusion leverages the simplicity of XOR operations for noise addition and removal and employs binary cross-entropy loss for training. Our approach eliminates the need for extensive preprocessing, complex noise parameter tuning, and pretraining on large datasets. We evaluate our model on several popular tabular benchmark datasets, demonstrating that Binary Diffusion outperforms existing state-of-the-art models on Travel, Adult Income, and Diabetes datasets while being significantly smaller in size.

latexit sha1, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2409.13882

Country:

North America > United States > California (0.06)
Europe > Switzerland > Geneva > Geneva (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.89)

Add feedback

Distributed Flexible Nonlinear Tensor Factorization §, Kai Zhang †, Pengyuan Wang ‡, Kuang-chih Lee

Neural Information Processing SystemsMar-12-2024, 14:59:08 GMT

algorithm, latent factor, tensor, (16 more...)

Neural Information Processing Systems

Country:

Asia > China (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Beyond Language Models: Byte Models are Digital World Simulators

Wu, Shangda, Tan, Xu, Wang, Zili, Wang, Rui, Li, Xiaobing, Sun, Maosong

arXiv.org Artificial IntelligenceFeb-29-2024

Traditional deep learning often overlooks bytes, the basic units of the digital world, where all forms of information and operations are encoded and manipulated in binary format. Inspired by the success of next token prediction in natural language processing, we introduce bGPT, a model with next byte prediction to simulate the digital world. bGPT matches specialized models in performance across various modalities, including text, audio, and images, and offers new possibilities for predicting, simulating, and diagnosing algorithm or hardware behaviour. It has almost flawlessly replicated the process of converting symbolic music data, achieving a low error rate of 0.0011 bits per byte in converting ABC notation to MIDI format. In addition, bGPT demonstrates exceptional capabilities in simulating CPU behaviour, with an accuracy exceeding 99.99% in executing various operations. Leveraging next byte prediction, models like bGPT can directly learn from vast binary data, effectively simulating the intricate patterns of the digital world.

instruction, language model, sequence, (14 more...)

arXiv.org Artificial Intelligence

2402.19155

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > China (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)
Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.93)

Add feedback

Bitformer: An efficient Transformer with bitwise operation-based attention for Big Data Analytics at low-cost low-precision devices

Duan, Gaoxiang, Zhang, Junkai, Zheng, Xiaoying, Zhu, Yongxin

arXiv.org Artificial IntelligenceNov-22-2023

In the current landscape of large models, the Transformer stands as a cornerstone, playing a pivotal role in shaping the trajectory of modern models. However, its application encounters challenges attributed to the substantial computational intricacies intrinsic to its attention mechanism. Moreover, its reliance on high-precision floating-point operations presents specific hurdles, particularly evident in computation-intensive scenarios such as edge computing environments. These environments, characterized by resource-constrained devices and a preference for lower precision, necessitate innovative solutions. To tackle the exacting data processing demands posed by edge devices, we introduce the Bitformer model, an inventive extension of the Transformer paradigm. Central to this innovation is a novel attention mechanism that adeptly replaces conventional floating-point matrix multiplication with bitwise operations. This strategic substitution yields dual advantages. Not only does it maintain the attention mechanism's prowess in capturing intricate long-range information dependencies, but it also orchestrates a profound reduction in the computational complexity inherent in the attention operation. The transition from an $O(n^2d)$ complexity, typical of floating-point operations, to an $O(n^2T)$ complexity characterizing bitwise operations, substantiates this advantage. Notably, in this context, the parameter $T$ remains markedly smaller than the conventional dimensionality parameter $d$. The Bitformer model in essence endeavors to reconcile the indomitable requirements of modern computing landscapes with the constraints posed by edge computing scenarios. By forging this innovative path, we bridge the gap between high-performing models and resource-scarce environments, thus unveiling a promising trajectory for further advancements in the field.

attention mechanism, edge device, opération, (14 more...)

arXiv.org Artificial Intelligence

2311.13502

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.48)

Industry: Energy (0.68)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unsupervised Learning of Mixtures of Multiple Causes in Binary Data

Neural Information Processing SystemsApr-6-2023, 18:52:54 GMT

This paper presents a formulation for unsupervised learning of clus(cid:173) ters reflecting multiple causal structure in binary data. Unlike the standard mixture model, a multiple cause model accounts for ob(cid:173) served data by combining assertions from many hidden causes, each of which can pertain to varying degree to any subset of the observ(cid:173) able dimensions. A crucial issue is the mixing-function for combin(cid:173) ing beliefs from different cluster-centers in order to generate data reconstructions whose errors are minimized both during recognition and learning. We demonstrate a weakness inherent to the popular weighted sum followed by sigmoid squashing, and offer an alterna(cid:173) tive form of the nonlinearity. Results are presented demonstrating the algorithm's ability successfully to discover coherent multiple causal representat.ions of noisy test data and in images of printed characters.

binary data, cid, unsupervised learning

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.66)

Add feedback